Using R to Orchestrate APIs

A presentation for Research Data at the Edge, Day One of Duke Research Computing Symposium

Hosted by the Data & Visualization Services Department.

The Files

The presentation materials were composed in R Markdown via RStudio and stored in a GitHub repository. The slides and notebook are served via GitHub Pages.

Outline

Why?

The Web has lots of stuff

  • frontier beyond curated datasets
  • stuff is wrapped in HTML
  • HTML is transported over HTTP but composed for human-to-machine (h2m) consumption
  • Intellectual Property rights bear serious consideration

API

Application Programming Interface

  • Built for machine-to-machine interactions
  • Instructions for programs

Client / Server

  • Make [R] interface with the web
  • Same exchange as human-to-machine (h2m), but now machine-to-machine (m2m)
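
To make the client side of this concrete, here is a minimal sketch of how R can assemble the URL that a program would send to an API. The helper name `build_query_url` is hypothetical (it is not part of any package), and the parameter names `t`, `plot`, and `r` are borrowed from the OMDb request used later in this presentation. Only base R is used, and no request is actually sent.

```r
# Hypothetical helper: join a base URL and a named list of parameters
# into a single query URL, percent-encoding keys and values.
build_query_url <- function(base_url, params) {
  keys <- vapply(names(params), utils::URLencode, character(1), reserved = TRUE)
  vals <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  paste0(base_url, "?", paste(keys, vals, sep = "=", collapse = "&"))
}

# Parameter names taken from the OMDb example later in the slides.
url <- build_query_url(
  "http://www.omdbapi.com/",
  list(t = "rocky", plot = "full", r = "json")
)
url
```

Building the URL from a named list keeps the parameters readable and avoids hand-editing a long query string when one value changes.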

Human Simulation

A dramatization…

  • Person uses Web Client
    • Person enters a URL
    • Client & server negotiate
      dramatization: good handshake
    • Information is sent back wrapped in HTML
    • Web browser parses the HTML

m2m – development

dramatization: confused about the protocol


JSON

# from https://en.wikipedia.org/wiki/JSON
{
  "firstName": "John",
  "lastName": "Smith",
  "isAlive": true,
  "age": 25,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "office",
      "number": "646 555-4567"
    },
    {
      "type": "mobile",
      "number": "123 456-7890"
    }
  ],
  "children": [],
  "spouse": null
}
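
As a sketch of how this structure arrives in R, the same record can be parsed from a string with `jsonlite::fromJSON()`. The variable names `person_json` and `person` are illustrative. The leading `# from ...` attribution line is left out of the string because JSON itself has no comment syntax.

```r
library(jsonlite)

# The Wikipedia example record, held in an R string.
person_json <- '{
  "firstName": "John",
  "lastName": "Smith",
  "isAlive": true,
  "age": 25,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {"type": "home",   "number": "212 555-1234"},
    {"type": "office", "number": "646 555-4567"},
    {"type": "mobile", "number": "123 456-7890"}
  ],
  "children": [],
  "spouse": null
}'

person <- fromJSON(person_json)

# Key-value pairs become named list elements.
person$firstName
person$isAlive

# A nested object becomes a nested list.
person$address$city

# An array of objects with the same keys is simplified to a data frame.
person$phoneNumbers
person$phoneNumbers$type

# Inspect the overall shape of the parsed result.
str(person)
```

The three JSON container shapes map onto familiar R types here: objects to named lists, arrays of uniform objects to data frames, and scalars to length-one vectors.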

Example

To Follow Along

  1. Open an RStudio Docker Container - https://vm-manage.oit.duke.edu/containers/rstudio
  2. Project > New Project
  3. Version Control > Git
  4. Repository URL = https://github.com/libjohn/r-api-json.git > Create Project
  5. Open API-JSON-Symposium.Rmd file
    • Run All
    • Go to line 150-ish (“### Demonstration”)

Demonstration

library(jsonlite)
# https://cran.r-project.org/web/packages/jsonlite/vignettes/json-aaquickstart.html
# for building tibbles
library(tidyverse)

Single JSON object

When the server response is a single JSON object, jsonlite makes viewing the data pretty simple.

oneJSONresult <- fromJSON("http://www.omdbapi.com/?t=rocky&y=&plot=full&r=json")

Let’s see the results in the next slide.


oneJSONresult
$Title
[1] "Rocky"

$Year
[1] "1976"

$Rated
[1] "PG"

$Released
[1] "03 Dec 1976"

$Runtime
[1] "120 min"

$Genre
[1] "Drama, Sport"

$Director
[1] "John G. Avildsen"

$Writer
[1] "Sylvester Stallone"

$Actors
[1] "Sylvester Stallone, Talia Shire, Burt Young, Carl Weathers"

$Plot
[1] "Rocky Balboa is a struggling boxer trying to make the big time, working as a debt collector for a pittance. When heavyweight champion Apollo Creed visits Philadelphia, his managers want to set up an exhibition match between Creed and a struggling boxer, touting the fight as a chance for a \"nobody\" to become a \"somebody\". The match is supposed to be easily won by Creed, but someone forgot to tell Rocky, who sees this as his only shot at the big time."

$Language
[1] "English"

$Country
[1] "USA"

$Awards
[1] "Won 3 Oscars. Another 16 wins & 21 nominations."

$Poster
[1] "https://images-na.ssl-images-amazon.com/images/M/MV5BMTY5MDMzODUyOF5BMl5BanBnXkFtZTcwMTQ3NTMyNA@@._V1_SX300.jpg"

$Metascore
[1] "N/A"

$imdbRating
[1] "8.1"

$imdbVotes
[1] "387,927"

$imdbID
[1] "tt0075148"

$Type
[1] "movie"

$Response
[1] "True"

The resulting list behaves as you would expect in R.
  • You can list all the element names.
names(oneJSONresult)
 [1] "Title"      "Year"       "Rated"      "Released"   "Runtime"    "Genre"      "Director"   "Writer"     "Actors"    
[10] "Plot"       "Language"   "Country"    "Awards"     "Poster"     "Metascore"  "imdbRating" "imdbVotes"  "imdbID"    
[19] "Type"       "Response"  
  • You can extract an individual element.
oneJSONresult$Title
[1] "Rocky"
oneJSONresult$Awards
[1] "Won 3 Oscars. Another 16 wins & 21 nominations."
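
Every value in the output above is a character string, including the year, rating, and vote count, so numeric work needs a conversion step first. The following is a minimal sketch of that step, assuming a handful of values copied from the output above rather than a live request. The names `one_result`, `movie`, and `Runtime_min` are illustrative additions.

```r
# A subset of the fields shown above, copied by hand so no request is needed.
one_result <- list(
  Title      = "Rocky",
  Year       = "1976",
  Runtime    = "120 min",
  Metascore  = "N/A",
  imdbRating = "8.1",
  imdbVotes  = "387,927"
)

# A flat named list converts to a one-row data frame.
movie <- as.data.frame(one_result, stringsAsFactors = FALSE)

# Convert the character columns to the types their content implies.
movie$Year        <- as.integer(movie$Year)
movie$Runtime_min <- as.integer(sub(" min$", "", movie$Runtime))
movie$imdbRating  <- as.numeric(movie$imdbRating)
movie$imdbVotes   <- as.integer(gsub(",", "", movie$imdbVotes))

# Treat the "N/A" placeholder as a missing value before converting.
movie$Metascore <- as.integer(
  replace(movie$Metascore, movie$Metascore == "N/A", NA)
)

str(movie)
```

The same conversions would apply to a tibble. Base R is used here only to keep the sketch free of extra dependencies.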

A JSON Matrix

The output of this code snippet displays differently in the console, in the Notebook script output, and in the Notebook HTML output. In the Notebook script output you can find the component name, in this case $Search. Alternatively, you can use bracket notation: [[1]]. Once you identify the component name, it’s easier to identify the element names.

jsonSeriesResutlsMatrix <- fromJSON("http://www.omdbapi.com/?s=rocky&type=series&r=json&page=1")
jsonSeriesResutlsMatrix
$Search

$totalResults
[1] "20"

$Response
[1] "True"

Call the search results. fromJSON has already simplified the JSON array of results into a data frame.

jsonSeriesResutlsMatrix$Search

jsonSeriesResutlsMatrix$Search$Title
 [1] "Rocky and His Friends"         "Dr. Jeff: Rocky Mountain Vet"  "Rocky Jones, Space Ranger"     "Rocky Mountain Law"           
 [5] "Rocky King, Detective"         "Rocky Road"                    "Rocky Mountain Bounty Hunters" "Rocky + Drago"                
 [9] "Rocky Point"                   "Rocky Star"                   
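
The shape of the search response can be reproduced offline. This sketch assumes a cut-down copy of the response that keeps only three of the titles listed above plus the two top-level fields. A real response carries more fields per result, which are not reproduced here. The page size of 10 is an assumption inferred from the ten titles returned for `page=1`, and the names `search_json`, `results`, and `page_urls` are illustrative.

```r
library(jsonlite)

# Cut-down stand-in for the search response (titles copied from the output above).
search_json <- '{
  "Search": [
    {"Title": "Rocky and His Friends"},
    {"Title": "Rocky Jones, Space Ranger"},
    {"Title": "Rocky King, Detective"}
  ],
  "totalResults": "20",
  "Response": "True"
}'

results <- fromJSON(search_json)

# The array of result objects is simplified to a data frame.
class(results$Search)

# Name-based and position-based access reach the same component.
identical(results[[1]], results$Search)

# Columns of the data frame are ordinary vectors.
results$Search$Title

# totalResults arrives as a string, so convert it before doing arithmetic.
total     <- as.integer(results$totalResults)
page_size <- 10  # assumption: inferred from the ten titles on page 1
n_pages   <- ceiling(total / page_size)

# One request URL per page, following the pattern of the request above.
page_urls <- sprintf(
  "http://www.omdbapi.com/?s=rocky&type=series&r=json&page=%d",
  seq_len(n_pages)
)
page_urls
```

Each URL in `page_urls` could be passed to `fromJSON()` in turn, and the resulting `$Search` data frames could then be stacked with `rbind()` to collect every page.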

Resources
